Genomic and morphological data shed light on the complexities of shared ancestry between closely related duck species

Causes for genomic and morphological similarities among recently radiated species are often multifaceted and are further convoluted among species that readily interbreed. Here, we couple genomic and morphological trait comparisons to test the extent that ancestry and gene flow explain the retention of mallard-like traits within a sister species, the Mexican duck. First, we confirm that these taxa remain genetically structured, and that Mexican ducks exhibit an isolation-by-distance pattern. Despite the assumption of wide-spread hybridization, we found only a few late-stage hybrids, all from the southwestern USA. Next, assessing 23 morphological traits, we developed a genetically-vetted morphological key that is > 97% accurate in distinguishing across sex-age cohorts of Mexican ducks, mallards, and hybrids. During key development, we determined that 25% of genetically pure, immature male Mexican ducks of the northern population naturally displayed mallard-like traits in their formative plumage. In fact, applying this key to 55 museum specimens, we identified that only four of the 14 specimens originally classified as phenotypic hybrids were truly hybrids. We discuss how genomic and morphological comparisons shed light into the mechanism(s) underlying the evolution of complex phenotypic traits in recent radiations, and how misunderstanding the true morphological diversity within Mexican ducks resulted in taxonomic revisions that hindered conservation efforts.


Methods
Sampling, DNA extraction, and ddRAD-seq library preparation. We analyzed a total of 387 Mexican ducks, mallards, and their putative hybrids. In addition to previously published raw ddRAD sequences of Mexican ducks (N = 104; Lavretsky et al. 20 ), we filled in geographical gaps by sampling tissue or blood from individuals collected in the Mexican states of Chihuahua (N = 68) and Sinaloa (N = 18), and in the southwestern USA (N = 59) ( Fig. 1; https:// github. com/ jibro wn17/ MEDU_ Metad ata). In addition, a total of 138 wild (N = 76), domestic game-farm (N = 49), and feral Khaki Campbell (N = 13; i.e., a domestic breed known for their khakicolored plumage) mallards collected throughout North America were also included in analyses, which included newly collected and previously published data 11,20,25,32 (also see sample specifics in https:// github. com/ jibro wn17/ MEDU_ Metad ata). Note that game-farm mallards are domestic mallards being released on shooting preserves for hunting and/or dog training purposes and are now known to be the primary instigators of hybridization for wild populations of mallards and other mallard-like ducks 25 . Moreover, we used a group of feral Khaki Campbell mallards that were sampled alongside Mexican ducks and wild mallards as a genetic reference for other domestic mallards. Having wild and domestic mallards allowed us to determine which of these posed the highest risk for hybridization with Mexican ducks. Finally, two potential vagrant Mexican ducks collected in California were opportunistically sampled (Fig. 1A). Figures 1A, 4A and B have been derived by georeferencing and modifying publicly available USGS maps (https:// www. scien cebase. gov/ catal og/ item/ 4fb55 5ebe4 b04cb 93775 1db9) using ArcGIS Desktop 10.6 software (https:// suppo rt. esri. com/ en/ Produ cts/ Deskt op/ arcgi sdesk top/ arcmap/ 10-6-1).
This research was conducted under IACUC approval 967486-1 (University of Texas at El Paso) and USFWS scientific collecting permit MB11579C-3. All procedures involving animals were carried out in accordance with relevant guidelines and regulations.
Genomic DNA was extracted from blood or tissue using a DNeasy Blood & Tissue kit following the manufacturer's protocols (Qiagen, Valencia, CA, USA) for the newly collected 32 wild mallards and 146 putative Mexican ducks (see sample-specific SRA data in https:// github. com/ jibro wn17/ MEDU_ Metad ata). DNA quality was visually assessed on a 1% agarose gel to ensure high molecular weight bands, and quantified using a Qubit 3 Flourometer (Invitrogen, Carlsbad, CA, USA) to ensure a minimum concentration of 20 ng/μL. ddRAD-seq library preparation followed protocols outlined in DaCosta and Sorenson 37 (also see Lavretsky et al. 20 ). In short, genomic DNA from individual samples was enzymatically fragmented using SbfI and EcoRI restriction enzymes. Next, Illumina TruSeq compatible barcodes were ligated to each sample for future de-multiplexing. Libraries were quantified, pooled in equimolar volumes, and the multiplexed library sent to the University of Oregon  Fig. S3). Finally, (D) a haplotype network based on 628 sequenced base-pairs of the mitochondrial control region. Note that we identify mallards by origin (wild vs. domestic) and Mexican ducks by geographical location in PCA and mitochondrial haplotype network, whereas assignment probabilities are colored by the six genetic clusters as estimated with ADMIXTURE. www.nature.com/scientificreports/ Core Genomics Facility for 150 base-pair, single-end chemistry sequencing on an Illumina HiSeq 4000 (detailed methods can be found in Supplementary Document S1). Raw Illumina reads from previously and newly sequenced samples were de-multiplexed and processed using the computational pipeline described by DaCosta and Sorenson 37 (Python scripts available at http:// github. com/ BU-RAD-seq/ ddRAD-seq-Pipel ine; also see Lavretsky et al. 20 ). Loci were parsed into chromosomes by using BLASTN v.2 38 and mapped to a Mallard reference genome (Accession no. SS263068950-SS263191362) 39,40 . Detailed bioinformatics methods can be found in Supplementary Document S1. We also note that large sections of the Z-sex chromosome was found to be under directional selection in mallards, and was shown to have distinct population structure due to its unique evolutionary history as compared to autosomal chromosomes among North American mallard-like taxa 11,20 . Mitochondrial DNA. For the same 174 newly collected samples, primers L78 and H774 were used to PCR amplify and sequence 655 base pairs of the mtDNA control region 41,42 following Sanger Sequencing methods described in Lavretsky et al. 24  Nuclear population structure and estimates of molecular diversity. Nuclear population structure was based on independent bi-allelic ddRAD-seq autosomal single nucleotide polymorphisms (SNPs) and without using a priori assignment of individuals to populations or species. Bi-allelic SNPs were extracted from a concatenated fasta file of ddRAD-seq autosomal loci using a custom python script in plink format (i.e., ped & map files) 45 . Following, PLINK v1.07 46 was used to filter for singletons (i.e., minimum allele frequency (-maf 0.005)), any SNP missing ≥ 20% of data across samples (-geno 0.2), as well as any SNPs found to be in linkage disequilibrium (LD) (-indep-pairwise 2 1 0.5). One of the two SNPs was randomly excluded if an LD correlation factor (r2) > 0.5 was obtained.

Scientific Reports
First, a Principal Components Analysis (PCA) was done in the R package Adegenet 47 with the 'dudi.pca' function. Next, we used programs ADMIXTURE v. 1.3 48,49 and fineRADstructure 50 , with the latter being shown to better differentiate between patterns of shared ancestry (i.e., ILS) and recent hybridization in mallard complex taxa 11 . Specifically, assignment probabilities based on major allele frequencies such as in ADMIXTURE can be complicated by close genetic relationship, including for taxa that show patterns of isolation-by-distance as within Mexican ducks 20 . Rather, fineRADstructure co-ancestry matrices are calculated based on the rarest SNPs that contributes the most information, which not only allows it to account for linkage among individual ddRAD loci, but it is better suited for studies of incipient species with very little genetic differentiation 11,50 . For these reasons, we obtained both individual assignment and co-ancestry assignments across samples in order to better differentiate shared versus introgressed ancestry. Details about dataset filtering and settings used in ADMIXTURE and fineRADstructure analyses can be found in Supplementary Document S1.
Finally, composite pairwise estimates of relative divergence (ΦST) and nucleotide diversity (π) for mtDNA, as well as Autosomal and Z-chromosome ddRAD-seq loci were calculated across species as well as between Mexican duck sampling groups using the package PopGenome 51 in the program R.
Morphological assessment, assignment, and key development. A total of 302 birds with available structural and plumage trait data were examined, that included 194 genetically vetted mallards, Mexican ducks, and hybrids (see specifics in https:// github. com/ jibro wn17/ MEDU_ Metad ata). Due to a lack of genetically vetted mallards within our original molecular dataset and the apparent issues of feral game-farm mallard introgression among wild populations in eastern North America 25 , mallard morphological diversity was largely established with plumage information from an additional 108 mallards collected across their primary breeding range in the central prairies of Canada and the United States (i.e., the prairie pothole region). In addition to comparing samples from areas where the two species do not come into contact and thus may not interbreed, these mallards can still be considered to be of genetically pure North American wild ancestry (Fig. 1A). Plumage was assessed from both photos, frozen birds, and study skin specimens of the Mexican ducks, mallards, and putative hybrids housed at the UTEP Biodiversity collections (UTEPBC), Smithsonian National Museum of Natural History (SNMNH), American Museum of Natural History (AMNH) and the Museum of Wildlife and Fish Biology at University of California Davis (MWFB). Age and sex were assigned across samples using a molt cycle-based aging system of Pyle 52 . Specifically, a bird was determined to be immature if they were in pre-formative molt, formative plumage, first pre-alternate molt, or first alternate plumage, whereas adult birds were those in prebasic molt, definitive basic plumage, pre-alternate molt, or definitive alternate plumage. Morphological characterization of pure Mexican duck, pure mallard, or hybrids were based on categorization of each samples genetic assignment attained from our ddRAD-seq autosomal population structure analyses.
Morphological assessments largely followed procedures in Scott and Reynolds 30 and Bielefeld et al. 53 , and key development followed methods developed for the Hawaiian duck (A. wyvilliana, A. Engilis and J. Eadie, University of California-Davis, unpublished data) and mottled duck (A. fulvigula) 53 . Data was collected on an initial suite of six structural and 16 plumage traits, as well as mass ( www.nature.com/scientificreports/ attained by running an ANOVA (p < 0.001). For plumage, we first created categories within each of the 16 traits that would explain all possible phenotypes observed during visual assessment of photos and study skins across specimens (Table 1). Specifically, each trait was assigned an ordinal score representing whether it exhibited the Mexican duck (null = 0) or mallard (alternative = 1) form of the trait. Non-binary traits were given additional ordinal scores. For example, all adult male Mexican ducks exhibit straight central tail feathers, whereas the amount of tail-feather curl varied among male mallards. Thus, we surmised that a hybrid male would have at least some curl to its central tail feather. In this case, the tail curl trait was assigned 3 possible ordinal scores that represented pure Mexican duck (0 = no tail curl), mallard (2 = full tail curl), and intermediate (1 = partial curl) ( Table 1). Note, that the initial categorization of each trait was based on the variability identified by assessing a subset of samples with photos from Lavretsky et al. 20 , study skins from various museums, and stock photos from books 23 . Without a priori knowledge of the phenotypic variability expressed in hybrids, many traits were given multiple ordinal scores to capture potential plasticity of traits that may arise from ancestry and/or hybridization (e.g., percent green in head; Table 1). Next, we visually assigned an ordinal score for all 16 plumage traits across samples, and then (1) examined general structure among samples by analyzing all possible plumage traits with Table 1. Plumage (N = 16), structural + mass (N = 7) traits assessed between Mexican ducks (MEDU), mallards (MALL), and hybrids (HYB). All plumage traits found to be diagnostic are bolded, as well as the sex (Male = M; Female = F) and age (Immature = I; Adult = A) cohort for which the trait was diagnostic are identified. Assignment criteria, including species association are provided for each plumage. Finally, pair-wise comparison for which structural traits were significantly different (p < 0.01) are also provided-any unlisted comparisons are statistically insignificant.  www.nature.com/scientificreports/ a priori species information in a PCA using the "prcomp" command within the R base package (R Team 2020), and (2) determined which set of traits best explained the variance across age-sex cohorts of Mexican ducks, mallards, and hybrids using a linear discriminant analysis (LDA) as implemented in the R MASS package 54 . The summation across ordinal scores provided a final plumage score (PS) assigned to each sample. Finally, the categorization of sex-age cohorts of genetically pure Mexican duck, pure mallard, and hybrids established a range of PS values for each group; the strength of the assessed traits were determined based on the concordance between PS and genetic assignments across samples. Finally, we assessed 55 historical museum skins of putative Mexican ducks and hybrids deposited at the Museum of Southwestern Biology (MSB), and originally collected and classified by Huey 28 , and later by Hubbard 27 to determine the biological and taxonomic status of the Mexican duck.

Results
After filtering, we recovered a total of 3189 ddRAD-seq loci, with 3015 (270,895 (bp)) and 174 (15,609 bp) loci assigned to the autosomal and Z-sex chromosomes, respectively. Our dataset consisted of an average median sequencing depth of 128 (range = 26-687) reads per locus per sample, and an average of 97% of alleles present per locus. Finally, 625 bp of overlapping mtDNA control region was sequenced for a total of 376 samples (of 387; https:// github. com/ jibro wn17/ MEDU_ Metad ata).
Mitochondrial haplotype structure. Haplotype tree reconstruction using the mtDNA control region revealed the two known Old World (OW) A and New World (NW) B haplogroups (Fig. 1D) (Fig. 1D). Based on a simple tally, wild mallards had the most unique haplotypes (N = 43), followed by Mexican ducks from Southwestern USA, Chihuahua, and then those for more southern Mexico, whereas those from the western populations of Sonora and Sinaloa both carried only five unique haplotypes.  Materials Fig. S1A).
Next, analyzing the complete 387 sample set, co-ancestry analyses identified 29 samples making up 12 sibling groups (see Supplementary Materials Fig. S2 & https:// github. com/ jibro wn17/ MEDU_ Metad ata). All putative sibling groups indeed were caught at the same time and location, were often flightless juvenile individuals, and all carried the same respective mtDNA haplotype. For example, three northern Mexican ducks that were all flightless juveniles sampled at the same time from the same location (Bosque del Apache, New Mexico) were identified as siblings as they had higher-than-average estimates of individual co-ancestry with one another ( Supplementary  Materials Fig. S2). Given that the inclusion of full siblings strongly biased PCA and ADMIXTURE analyses, all but one representative per sibling group was subsequently excluded in final population structure analyses. In the end, PCA, ADMIXTURE, and fineRADstructure analyses were based on a total of 12,696 (of 13,835) independent bi-allelic ddRAD-seq autosomal SNPs for 370 samples.
We visualized the first three principal components in our PCA, and individual assignment probabilities estimated in ADMIXTURE for K = 6 ( Fig. 1B; Supplementary Materials Fig. S2). Despite recovering an optimum K of three, where two groups of domestic mallards were distinguished from Mexican ducks and wild mallards, additional structure was recovered by incrementally increasing the K to six (Supplementary Materials Fig. S2C). Given that we evaluated assignment probabilities at a population K value of six, bootstrap values and associated confidence intervals per sample were based on the sum assignment probability across mallard types (i.e., wild + game-farm + feral park), and the confidence intervals as the square root sum of each standard error squared 58 . Across all three analyses, we found wild mallards and Mexican ducks to be more closely related to one another than either were to domestic mallards, but that Mexican ducks and mallards remained In fact, < 1% of variation was explained by any single principal component ( Fig. 1B; Supplementary Materials  Fig. S2A-B). Within Mexican ducks, we recovered the three genetically unique populations: (1) a northern population comprised of the Mexican state of Chihuahua and Mexican ducks from the southwestern USA, (2) a western-coast cluster comprised of samples from the Mexican states of Sonora and Sinaloa, and (3) an interior Mexico population comprising samples from the Mexican states of Durango, Zacatecas, Guanajuato, Mexico, and Puebla (Fig. 1B). fineRADstructure showed a high degree of individual co-ancestry within these groups, as well as average co-ancestry to wild mallards that was on average higher than that recovered between domestic and wild mallards ( Fig. 2; Supplementary Materials Fig. S3). Additionally, we find that western-coast Mexican ducks show a higher average co-ancestry with Mexican ducks from Chihuahua. Domestic mallards showed the highest levels of co-ancestry between individual samples, which is consistent with a history of domestication 25 . Finally, whereas one of the two putative Mexican ducks from California was simply a female mallard, the other was indeed a genetically pure Mexican duck that clustered with other Mexican ducks from Chihuahua based on both ADMIXTURE assignment probabilities and fineRADstructure co-ancestry estimates.
Hybrid identification. First, ADMIXTURE identified a total of 53 samples that had substantial assignment (> 10%) to both Mexican ducks and the sum of all mallard types (i.e., domestic and wild). The majority of these were collected in southwestern USA (39/53, 74%), followed by Chihuahua (10/53, 19%), the western-coast regions of Sonora and Sinaloa (3/53, 6%), and a single sample from interior Mexico (1/53, 2%). In contrast, analyzing the full sample set with fineRADstructure identified a total of 25 samples, all of which were from the southwestern USA, with higher than average co-ancestry to both Mexican duck and mallard (wild or domestic) clusters. These samples were uniquely clustered between Mexican ducks and mallards and had a higher affinity to mallards (i.e., see dendrogram in Fig. 2; Supplementary Materials Fig. S3), suggesting these are indeed recent hybrids. Of the 25 samples, four had higher than average co-ancestry with feral Khaki Campbell mallards (Fig. 2). We note that an additional five samples were inferred to be hybrids through sibling relationships when analyzing the full dataset ( Supplementary Materials Fig. S3). Earlier caution on over-interpreting ADMIXTURE results (see above) is stressed as (1) mallards and Mexican ducks have a very close genetic relationship, (2) there is fine geographical-based structure within Mexican ducks, and (3) there have been recent founder events in western-coast states of Mexico (i.e., Sonora and Sinaloa). All of these factors can result in spurious assignments and forced us to analyze ADMIXTURE at a higher than optimum K (Supplementary Materials Fig. S1C). Additionally, we see the highest standard errors among Mexican ducks comprising the Northern genetic cluster, with the majority of them overlapping zero. Moreover, we found no association between an individual's PS value and mallard assignment probability as determined by ADMIXTURE, which included individuals having ≤ 30% ADMIXTURE assignment to mallard, yet Mexican duck-equivalent PS values. Similarly, we found that individuals with > 50% ADMIXTURE assignment to mallard had mallardequivalent PS values. Thus, in the case of the Mexican ducks, ADMIXTURE appears to be confounded by retained ancestry and is over-representing admixture ( Fig. 1C; Supplementary Materials Fig. S2C). We conclude that fineRADstructure 50 appears to be more reliable when dealing with recently diverged taxa that have high levels of shared ancestry, as well as with fine-scale geographical structure (e.g., isolation-by-distance among Mexican ducks; also see Lavretsky et al. 11 ).

Morphological assessment & assignment.
For mass and the six structural traits evaluated here, we found at least one statistical difference among evaluated groups in only three traits (Supplementary Materials Fig. S4). First, tarsus length was statistically significant between Mallards and Mexican ducks when comparing immature females, as well as adult females and males. Mass was statistically different among adult male hybrids, mallards, and Mexican ducks, whereas adult female hybrids or Mexican ducks were statistically different from mallards. Finally, adult male Mexican ducks were statistically different from hybrids and mallards. Note that immature male Mexican ducks, mallards, and hybrids were not statistically different in any of the six structural traits. Such variance and increased missingness of structural traits caused errors in downstream analyses, and so structural traits were excluded in our PCA and LDA.
Our PCA showed clear separation between Mexican ducks and mallards based on the 16 assessed plumage traits ( Fig. 3A; Supplementary Materials Fig. S5). Of the 25 genetic hybrids collected here, only 13 included morphological data; one and four clustered with Mexican ducks and mallards, respectively, while the remaining eight were recovered in intermediate space (Fig. 3B-E; also see sample specific scores in https:// github. com/ jibro wn17/ MEDU_ Metad ata). Next, LDA analyses recovered different optimum combinations of plumage traits as diagnostic for each age-sex cohort of Mexican ducks, mallards, and hybrids (Table 1; Supplementary Materials  Fig. S6), and thus, a specific range of PS values for each comparison (Fig. 3B-E). In general, Mexican ducks were assigned PS values of ≤ 4 regardless of sex-age cohort (Fig. 3). Applying the developed keys (Supplementary Materials Document S2 & S3), we attained 100% and 97% accuracy in assigning across genetically-vetted sex-age cohorts of pure female and male Mexican ducks, respectively (Fig. 3). The four Mexican duck museum specimens that were identified with hybrid-equivalent PS values included two immature and one adult male Mexican duck (Fig. 3B-E). Importantly, however, only one adult female grouped with Mexican ducks, with the remaining 92% assigned as hybrid or mallard (Fig. 3B-E). The inability to distinguish hybrids and mallards may be due to the general lack of Mexican duck x mallard hybrids in our dataset 53 . Nevertheless, the developed key attained our ultimate goal to distinguish pure Mexican ducks from all others (Plumage Keys can be found in Supplementary Materials Document S2 & S3).
Finally, we applied respective sex-age cohort developed keys to the 55 historical specimens that originally included 41 and 14 putative Mexican ducks and Mexican duck x mallard hybrids 27 , respectively (Fig. 3) Fig. S3).The level of recent coancestry is color coded from low (yellow) to high (blue). We color code mallards by origin (wild vs .domestic) and Mexican ducks by geographical location, as well as identify Mexican duck x (wild/feral) mallard hybrids and wild x game-farm mallard hybrids. www.nature.com/scientificreports/ duck (Fig. 3C), two and three were immature (Fig. 3D) and adult (Fig. 3E) male Mexican ducks, respectively. Next, only 4 of the 14 (28.6%) original historical specimens identified as hybrids were considered as such when assessed with our keys. Of the remaining ten specimens, eight and two had PS values equivalent to pure Mexican duck or mallard, respectively ( Fig. 3B-E). In the end, we find a total of ten historical samples that had hybrid equivalent PS scores, and determine that original assessments were highly inaccurate in hybrid identification.

Discussion
Increased genomic accessibility for non-model systems has allowed us to uncover evolutionary histories and subtle population structure within and between previously indistinguishable incipient forms 11 . Additionally, these methods permit for formal testing between potential causes of genetic and phenotypic trait retention among closely related taxa 59,60 . Whether ancestry, hybridization, or a combination of the two, cause retained genetic and morphological diversity among closely related species remains inferential unless genetically vetted. These questions are particularly complicated when only a portion of a population displays ancestral morphological traits, as the most parsimonious conclusion in such scenarios is to assume these are hybrids. Alternatively, such a scenario could result from stochastic retention of genetic variation and associated phenotypic traits, which have yet to be sorted among the incipient forms. For example, the clinal-like expression of mallard-like traits among Mexican ducks was hypothesized to be the result of extensive hybridization, to the point that Mexican ducks in the northern part of their range were thought to be a hybrid swarm 27,28,30,61 . However, genomic data presented here no longer support this conclusion. Rather, we provide strong evidence across the geographical range of the Mexican duck that hybrids are confined to southwestern USA, largely in human-dominated areas, and that formative males from this region naturally display a high proportion of mallard traits that appear to be subsequently lost as adults. Through our landscape level analysis we conclude that while Hubbard 27 correctly identified the clinal-like variability in the display of mallard phenotypes across the Mexican duck's range, he incorrectly concluded that it was due to high levels of hybridization with mallards. In fact, when applying our optimized plumage keys to historical samples from Hubbard 27 , we determined that 20% of immature males, 25% of adult males, and at least 57% of the hybrids were incorrectly identified (Fig. 3). Once re-evaluated, these historical samples show an almost identical distribution of PS values as samples collected here in the same www.nature.com/scientificreports/ respective regions (Fig. 4B). Thus, we conclude that the presence of mallard-derived genetic and phenotypic traits in Mexican ducks is the result of shared mallard ancestry and incomplete lineage sorting within Mexican ducks. Our study exemplifies the intricacies of species divergence, as well as emphasizes the importance of using genomic and morphological data jointly when attempting to disentangle ancestry and hybridization as a cause for trait (genetic and/or morphological) similarities among closely related taxa. More generally, in concordance with evidence from Florida mottled ducks (Anas fulvigula) 53 , American black ducks (Anas rubripes) 62 , and Hawaiian ducks 63 , our results continue to support the hypothesis that the wild mallard was the ancestor to Mexican ducks and all North American monochromatic duck species 11,25,64 . While we note that Hawaiian ducks represent a unique case of hybrid speciation 35,63 , the evolution of mainland North American monochromatic species has likely resulted from the divergence of isolated pockets of mallards 25,32 . These kind of sequential divergence events of mallard-like species throughout North America, which have occurred over the 500,000 years, would explain the significant discord in genetic and phenotypic divergence found within these species 20,24,25 . Our study demonstrates the need for a holistic approach to truly understand such complex evolutionary histories as observed between Mexican ducks and wild mallards, as a single phenotypic or genetic marker type may yield spurious conclusions, which can have important implications for their conservation. For the Mexican duck, misunderstanding the cause for mallard-like phenotypes led to decades of taxonomic revisions and the belief that they were largely doomed due to introgressive hybridization 27,30,65 . Population structure of and shared phenotypic expression in Mexican ducks explained by retained wild mallard ancestry and not hybridization. In addition to providing the most up-to-date structural and plumage trait descriptions for the Mexican duck (Table 1), we used genetically vetted individuals to shed light on age-sex cohort variability of traits (Fig. 4B). Whereas adult male, adult female, and immature female Mexican ducks showed similar plumage patterns across their entire range, immature male Mexican ducks exhibited much greater variation. Specifically, we find that 25% (14 of 56) of genetically pure immature male Mexican ducks from the northern (~ 30%; 10 of 33) and western-coast (~ 17%; 4 of 23) genetic clusters innately express a higher proportion of mallard-like traits as compared to those from the interior Mexican ducks (i.e., 0%; Fig. 4B). In short, immature males can have all or a combination of ≤ 25% green in the head, unique tidal back patterns, black in the rump, vermiculation in the body, variable dark vermiculation in the under-tail feathers, and elevated tail curls (Table 1). Despite the problems that increased variability might cause in distinguishing hybrids, our plumage key for immature males resulted in 98% (179 of 182) assignment accuracy (Fig. 3D). In the end, we developed plumage keys for each age-sex cohort of Mexican ducks that can confidently (< 3% error rate) be applied to distinguish pure Mexican ducks from mallards/hybrids. While future work will require additional comparisons to determine whether modifications are needed, our efforts here resulted in the development of a plumage key that standardizes field identification of Mexican ducks (Supplementary Materials Documents S2 & S3), providing an important resource for citizen scientists and wildlife managers. www.nature.com/scientificreports/ Although the evolution of dichromatism is often believed to be a derived state, an assessment of 977 species of birds suggests that there are many cases where dichromatism has been lost in favor of monochromatism 66 . However, in cases where dichromatism is lost, the underlying genetic diversity remains 66,67 , as dimorphic traits become controlled via modifiers (e.g., modifier alleles, steroids) rather than direct molecular changes [67][68][69] . In fact, the seasonal expression of dichromatic plumage in waterfowl are known to be estrogen-dependent 70 , where colorful plumage develops in the absence of testosterone/estrogen or luteinizing hormones [70][71][72][73] . Thus, monochromatic mallard-like ducks of North America that retain ancestrally derived mallard traits may control the expression of such plumage-linked genes through consistently elevated levels of testosterone or estrogen, which would induce monochromatic plumage year-round 74 . Regarding immature males, we hypothesize that unsuitable levels of testosterone prior to their pre-formative molt causes the partial expression of mallard-like traits in their first year. Similarly, older male and female monochromatic mallard-like ducks are known to regularly express male mallard-like plumage traits [67][68][69] , due to no longer being able to properly regulate hormone expression; this would explain the one adult male Mexican duck in our dataset that was identified as genetically pure, yet had a hybrid equivalent PS value (Fig. 3E). In general, New World mallard complex species provide a unique set of incipient forms that could act as a natural study system to understand the mechanisms underlying the evolution of complex phenotypic traits during the early stages of speciation as well as how expression of these traits is controlled during molt cycles.
Low levels and confined Mexican duck x mallard hybridization. Although Mexican ducks have always been thought to be at risk of genetic swamping or extinction 27,30,65 , we contend that this has likely never been the case. We recovered few hybrids, all of which were confined to the northern part of the Mexican duck's range (Figs. 2, 3 and 4). As predicted, we identified genetic hybrids primarily where Mexican ducks and mallards (wild and/or domestic) geographically co-occur during the breeding season. In fact, three of the four historical samples that possessed hybrid equivalent PS scores were found to be < 50 miles from large metropolitan areas. Thus, we caution that our results may in fact be overestimating hybridization rates due to the number of samples collected in more urban areas, where mallards (wild and feral) often remain year-round. While hybridization does occur, we conclude that it is relatively rare and largely confined to the southwestern USA (Fig. 4). Given that past taxonomic uncertainty was largely based on high rates of interbreeding with mallards, these results further support the recent taxonomic elevation of the Mexican duck to full species status 26 .
While we conclude that hybridization with mallards is currently not a major conservation concern, continued monitoring where locally breeding mallards (wild or domestic) occur is necessary. It will be particularly important to understand what proportion of hybridization is due to interactions with domestic mallards, as we provide evidence that at least four (of 25) putative hybrids were the result of pairings between Mexican ducks and feral park ducks (i.e., Khaki Campbell mallards). Additionally, we found three Mexican ducks carrying OW A haplotypes (Fig. 1D), which could only occur through a male Mexican duck and female domestic mallard pairing 25 . As domestic and not wild mallards continue to be found as the primary instigator of hybridization events among mallard complex taxa (Fig. 4A) 11,25,35 , educating the general public regarding the conservation consequences of domestic-lineage waterfowl releases is increasingly important.
Mexican ducks evolved through sequential founder events. Landscape-level genetic sampling not only supports the three distinct Mexican duck genetic clusters previously reported by Lavretsky et al. 20 , but also fills in geographical gaps to expand our understanding of their population structure across the landscape (Figs. 1, 2 and 4A). In general, the fine-scale population structure recovered with nuclear DNA and mtDNA support recent findings that Mexican ducks evolved through sequential founder events 32 . Given that samples comprising the northern genetic cluster of Mexican ducks harbor all major mtDNA haplotypes (Fig. 1D) and are the combination of all nuclear variation found in other groups, we hypothesize that this is the ancestral population for all Mexican ducks, acting as the source for subsequent founder events into interior Mexico (Figs. 1C and 4A). In fact, this aligns with demographic and genotype-environment association models from Brown et al. 32 , which showed a cyclical pattern of population growth and contraction that generally corresponded with glacial and inter-glacial cycles, respectively 32 . Next, while we hypothesized that Mexican ducks from Sinaloa would be the natural extension of the closest interior populations, we actually find that both Sinaloa and Sonoran Mexican ducks are most similar to one another (Fig. 1D). Low mtDNA haplotype variation and strong partitioning in nuclear markers with other clusters is consistent with reports suggesting that Mexican ducks have only recently (i.e., ~ 1990) colonized Mexico's western-coasts, likely from Chihuahua given the haplotype sharing between these groups (Figs. 1C, 2, and 4A) 20,75 . This westward expansion was likely in response to novel water availability due to the build-up of large dams, reservoirs, and canal systems for the rapid conversion of dry upland areas to intensive agriculture areas over the last 50 years in Mexico 76 .
Finally, we report for the first time a genetically pure Mexican duck collected in the Grasslands Area of Merced County, California; and while recently, an increasing number of Mexican ducks have been reported in California 77 , this is the furthest north this species has been genetically identified (Fig. 1). Surprisingly, while this bird was expected to be a vagrant from Mexico's western-coast due to geographic proximity, this Mexican duck shared a mtDNA haplotype with and, based on nuclear loci, clustered with Chihuahuan samples (Figs. 1 and 2). Thus, this sample demonstrates that large-scale dispersal of Mexican ducks from their interior range is possible. While fine-scale population structure suggests that inter-population dispersal is infrequent (also see Lavretsky et al. 20 ), the potential for and frequency of such migratory events to establish new populations outside of their current range remains unknown; in fact, it is possible that "vagrant" dispersal events may be how Mexican ducks expanded through time 78 . Finally, in addition to whole genome data, studies of Mexican duck ecology, short-and